Added the basic TDNN+LSTM scripts #1205
Conversation
This is great news as it means we can do online decoding with less latency.
The latency of the TDNN+LSTM models right now is 21 frames, less than half that of the BLSTM. The current set of experiments being run are designed to find the optimal configuration.

--Vijay
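For intuition about where a frame-latency number like this comes from: in an nnet3-style model, the decoding latency is the model's total right context, i.e. the sum over layers of the largest positive time offset each layer splices. A minimal sketch (the per-layer offsets below are invented for illustration, not taken from the actual recipe):

```python
# Illustrative splice indexes for a small TDNN+LSTM stack.
# These offsets are made up for this sketch; the real recipe's
# splice configuration lives in the checked-in scripts.
splice_indexes = [
    [-2, -1, 0, 1, 2],   # tdnn1: looks 2 frames into the future
    [-1, 0, 1],          # tdnn2: 1 frame into the future
    [-3, 0, 3],          # tdnn3: 3 frames into the future
    [-3, 0],             # lstm:  recurrence only looks backwards
]

# Total right context = number of future frames the output depends on,
# which is the per-chunk decoding latency in frames.
latency_frames = sum(max(offsets) for offsets in splice_indexes)
print(latency_frames)  # 6 for this made-up configuration
```

A bidirectional LSTM, by contrast, must see the whole chunk of future frames before emitting output, which is why its effective latency is so much larger.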
I added the results for the TDNN+LSTM setup. As expected they look better than the TDNN and LSTM results. Further, TDNN+LSTM without pretraining is working better than TDNN+LSTM with it. (Please note that we saw improvements due to removal of layer-wise pretraining in TDNNs but not in LSTMs.)

# System                  lstm_6j     tdnn_7h     this(old)   this(new)
# WER on train_dev(tg)    14.66       13.84       13.88       13.42
# WER on train_dev(fg)    13.42       12.84       12.99       12.42
# WER on eval2000(tg)     16.8        16.5        16.0        15.7
# WER on eval2000(fg)     15.4        14.8        14.5        14.2
# Final train prob        -0.0824531  -0.0889771  -0.0515623  -0.0538088
# Final valid prob        -0.0989325  -0.113102   -0.0784436  -0.0800484
# Final train prob (xent) -1.15506    -1.2533     -0.782815   -0.7603
# Final valid prob (xent) -1.24364    -1.36743    -0.946914   -0.949909

BLSTM training is still running. I might have the results in a day. The plots below compare the log-likelihood values for the TDNN+LSTM and BLSTM setups.
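For reference, the relative WER reductions quoted in discussions like this can be reproduced from the numbers in the table; a quick sketch (the baseline/new pairing of `tdnn_7h` against `this(new)` is one reasonable comparison, chosen here for illustration):

```python
# WER pairs (baseline, new) taken from the results table:
# tdnn_7h vs. this(new).
results = {
    'train_dev(tg)': (13.84, 13.42),
    'eval2000(fg)':  (14.80, 14.20),
}

# Relative improvement = 100 * (baseline - new) / baseline.
rels = {name: 100.0 * (base - new) / base
        for name, (base, new) in results.items()}

for name, rel in sorted(rels.items()):
    print('%s: %.1f%% relative' % (name, rel))
```

This gives roughly a 3% relative reduction on train_dev(tg) and 4% on eval2000(fg).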
self.config = { 'input':'[-1]',
                'dim':-1,
                'max-change' : 0.75,
                'bias-stddev' : 0,
We now support ng-affine-options, so it is no longer necessary to explicitly support *stddev options.
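A sketch of what that simplification might look like in a layer's default config (the exact keys here are illustrative; the actual defaults are in the xconfig layer code):

```python
# Before: each option of the underlying affine component needed its own
# config key in every layer that wanted to expose it.
old_config = {'input': '[-1]', 'dim': -1, 'max-change': 0.75,
              'bias-stddev': 0}

# After: extra options for the affine component are passed through as a
# single opaque string, so the layer code no longer has to enumerate
# every *stddev-style option individually.
new_config = {'input': '[-1]', 'dim': -1, 'max-change': 0.75,
              'ng-affine-options': 'bias-stddev=0'}
```

The layer then just appends the `ng-affine-options` string to the component line it generates, instead of formatting each option itself.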
Force-pushed from 06efa23 to 7e630dd.
@@ -0,0 +1,110 @@
# Copyright 2016 Johns Hopkins University (Dan Povey)
OK with me.
1. Added a TDNN+LSTM recipe which performs similarly to the BLSTM model with significantly smaller latency (21 frames vs 51 frames).
2. Added BLSTM results in the xconfig setup, without layer-wise discriminative pre-training (2.7% rel. improvement).
3. Added an example TDNN recipe which uses a subset of the feature vector from neighboring time steps (results pending).

xconfig: Added a tdnn layer which can deal with the subset-dim option.
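For readers unfamiliar with xconfig, an interleaved TDNN+LSTM stack of the kind described above might be sketched like this (dimensions, names, and splice offsets are illustrative placeholders, not the recipe's actual values):

```
# Hypothetical xconfig sketch of a TDNN+LSTM stack; the real recipe's
# layer sizes and offsets are in the checked-in scripts.
input dim=40 name=input
relu-renorm-layer name=tdnn1 dim=1024 input=Append(-2,-1,0,1,2)
relu-renorm-layer name=tdnn2 dim=1024 input=Append(-1,0,1)
lstmp-layer name=lstm1 cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=-3
relu-renorm-layer name=tdnn3 dim=1024 input=Append(-3,0,3)
lstmp-layer name=lstm2 cell-dim=1024 recurrent-projection-dim=256 non-recurrent-projection-dim=256 delay=-3
output-layer name=output input=lstm2 dim=6000 max-change=1.5
```

The point of the xconfig layer is that each line expands into the several nnet3 components (affine, nonlinearity, renorm, recurrence) that would otherwise have to be written out by hand.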
Force-pushed from 5563164 to 34d75fe.
Ready for review.
Looks good to me, merge when you want.


The TDNN+LSTM was found to be as good as the BLSTM architecture in previous experiments (not checked in). This PR adds scripts that specify these models via xconfig.
I will add the results in a day.